Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 15926 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.6 MiB |
| Average record size in memory | 104.0 B |
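The memory figures above are consistent with every column being stored as an 8-byte value. That storage width is an assumption made for this sketch, not something the report states; the arithmetic is:

```python
# Assumption (not stated in the report): each of the 13 columns is stored
# as an 8-byte value such as float64 or int64.
N_VARIABLES = 13
N_OBSERVATIONS = 15926
BYTES_PER_VALUE = 8  # assumed

record_bytes = N_VARIABLES * BYTES_PER_VALUE      # bytes per row
total_bytes = record_bytes * N_OBSERVATIONS       # bytes for all rows
total_mib = total_bytes / 2**20                   # 1 MiB = 2**20 bytes

print(f"Average record size: {record_bytes} B")
print(f"Total size: {total_mib:.1f} MiB")
```

Small fixed overheads such as the row index are ignored here, which is why the result only matches the reported values after rounding.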
Variable types
| Categorical | 1 |
|---|---|
| Numeric | 12 |
Alerts
| Geography[0] is highly correlated with Geography[1] | High correlation |
|---|---|
| Geography[1] is highly correlated with Geography[0] | High correlation |
| Geography[0] is highly correlated with Geography[1] and 1 other field | High correlation |
| Geography[2] is highly correlated with Geography[0] | High correlation |
| Exited is highly correlated with Geography[0] and 5 other fields | High correlation |
| Geography[0] is highly correlated with Exited and 5 other fields | High correlation |
| Geography[1] is highly correlated with Geography[0] and 5 other fields | High correlation |
| Geography[2] is highly correlated with Geography[0] and 4 other fields | High correlation |
| Gender is highly correlated with Exited and 5 other fields | High correlation |
| Age is highly correlated with Exited | High correlation |
| Balance is highly correlated with Geography[1] | High correlation |
| NumOfProducts is highly correlated with Exited | High correlation |
| HasCrCard is highly correlated with Exited and 5 other fields | High correlation |
| IsActiveMember is highly correlated with Exited and 5 other fields | High correlation |
| Exited is uniformly distributed | Uniform |
| Geography[0] has 7293 (45.8%) zeros | Zeros |
| Geography[1] has 10166 (63.8%) zeros | Zeros |
| Geography[2] has 11399 (71.6%) zeros | Zeros |
| Gender has 6386 (40.1%) zeros | Zeros |
| Tenure has 425 (2.7%) zeros | Zeros |
| Balance has 5097 (32.0%) zeros | Zeros |
| HasCrCard has 3471 (21.8%) zeros | Zeros |
| IsActiveMember has 7195 (45.2%) zeros | Zeros |
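The zero percentages in these alerts are the zero counts divided by the 15926 observations. A minimal sketch that recomputes them from the counts listed above (the helper name `zero_share` is hypothetical):

```python
N_OBSERVATIONS = 15926

# Zero counts copied from the alerts above.
zero_counts = {
    "Geography[0]": 7293,
    "Geography[1]": 10166,
    "Geography[2]": 11399,
    "Gender": 6386,
    "Tenure": 425,
    "Balance": 5097,
    "HasCrCard": 3471,
    "IsActiveMember": 7195,
}


def zero_share(count, total=N_OBSERVATIONS):
    """Return the share of zeros as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


for name, count in zero_counts.items():
    print(f"{name} has {count} ({zero_share(count)}%) zeros")
```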
Reproduction
| Analysis started | 2023-04-14 12:29:29.948192 |
|---|---|
| Analysis finished | 2023-04-14 12:30:05.190147 |
| Duration | 35.24 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
Exited
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 124.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 15926 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
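As an illustration of those character properties, Python's standard `unicodedata` module can look up the general category of the two characters that occur in this variable. This is only a sketch of the idea, not necessarily how the report computes its figures:

```python
import unicodedata

# The categorical variable in this report contains only "0" and "1".
for ch in "01":
    category = unicodedata.category(ch)   # "Nd" is the Decimal Number category
    name = unicodedata.name(ch)
    in_ascii_block = ch.isascii()         # code point below 128
    print(ch, category, name, in_ascii_block)
```

Both characters fall in the Decimal Number category and the ASCII block, which matches the single distinct category and block reported above.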
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 7963 | 50.0% |
| 0 | 7963 | 50.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 50.0% |
| 1 | 7963 | 50.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 50.0% |
| 1 | 7963 | 50.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 15926 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 50.0% |
| 1 | 7963 | 50.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 15926 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 50.0% |
| 1 | 7963 | 50.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 15926 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7963 | 50.0% |
| 1 | 7963 | 50.0% |
Geography[0]
Real number (ℝ≥0)
Alerts: High correlation, Zeros
| Distinct | 2416 |
|---|---|
| Distinct (%) | 15.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4663632586 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 7293 |
| Zeros (%) | 45.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.2860500817 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.4726111121 |
|---|---|
| Coefficient of variation (CV) | 1.013396968 |
| Kurtosis | -1.892849872 |
| Mean | 0.4663632586 |
| Median Absolute Deviation (MAD) | 0.2860500817 |
| Skewness | 0.1336832634 |
| Sum | 7427.301256 |
| Variance | 0.2233612633 |
| Monotonicity | Not monotonic |
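Several of the descriptive statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short check using the Geography[0] figures from the tables above:

```python
# Values copied from the Geography[0] tables above.
n = 15926
mean = 0.4663632586
std = 0.4726111121

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the squared standard deviation
total = mean * n         # sum of all values

print(f"CV       = {cv:.9f}")
print(f"Variance = {variance:.10f}")
print(f"Sum      = {total:.6f}")
```

The results agree with the reported values up to the rounding of the inputs.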
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 7293 | |
| 1 | 6219 | |
| 0.8980458875 | 1 | < 0.1% |
| 0.4617514384 | 1 | < 0.1% |
| 0.6903215962 | 1 | < 0.1% |
| 0.02148767685 | 1 | < 0.1% |
| 0.8725479125 | 1 | < 0.1% |
| 0.3076273581 | 1 | < 0.1% |
| 0.4132179225 | 1 | < 0.1% |
| 0.5258710185 | 1 | < 0.1% |
| Other values (2406) | 2406 | 15.1% |
| Value | Count | Frequency (%) |
| 0 | 7293 | |
| 0.0003343308747 | 1 | < 0.1% |
| 0.0007480543662 | 1 | < 0.1% |
| 0.001475630066 | 1 | < 0.1% |
| 0.001908650055 | 1 | < 0.1% |
| 0.001912720145 | 1 | < 0.1% |
| 0.002081496148 | 1 | < 0.1% |
| 0.002139095346 | 1 | < 0.1% |
| 0.003239827356 | 1 | < 0.1% |
| 0.003336577413 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 6219 | |
| 0.9998912366 | 1 | < 0.1% |
| 0.9992778277 | 1 | < 0.1% |
| 0.9990543689 | 1 | < 0.1% |
| 0.9982214576 | 1 | < 0.1% |
| 0.9980400877 | 1 | < 0.1% |
| 0.9976914389 | 1 | < 0.1% |
| 0.9973095264 | 1 | < 0.1% |
| 0.9970379897 | 1 | < 0.1% |
| 0.9966740454 | 1 | < 0.1% |
Geography[1]
Real number (ℝ≥0)
Alerts: High correlation, Zeros
| Distinct | 1840 |
|---|---|
| Distinct (%) | 11.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.3042879937 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 10166 |
| Zeros (%) | 63.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.9656332262 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.9656332262 |
Descriptive statistics
| Standard deviation | 0.4384928776 |
|---|---|
| Coefficient of variation (CV) | 1.44104561 |
| Kurtosis | -1.188705293 |
| Mean | 0.3042879937 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.8475897461 |
| Sum | 4846.090587 |
| Variance | 0.1922760037 |
| Monotonicity | Not monotonic |
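A median absolute deviation (MAD) of 0 is expected here: the median is 0 and 63.8% of the values are zero, so more than half of the absolute deviations from the median are themselves zero. A minimal sketch with hypothetical toy data (the sample lists are not taken from the dataset):

```python
from statistics import median


def mad(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median([abs(v - centre) for v in values])


# Hypothetical sample in which 6 of 10 values are zero.
mostly_zero = [0, 0, 0, 0, 0, 0, 0.3, 1, 1, 1]
print(mad(mostly_zero))

# A more spread-out hypothetical sample for comparison.
spread = [1, 2, 3, 4, 100]
print(mad(spread))
```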
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 10166 | |
| 1 | 3922 | 24.6% |
| 0.1603990829 | 1 | < 0.1% |
| 0.5523056212 | 1 | < 0.1% |
| 0.1921526118 | 1 | < 0.1% |
| 0.6588088692 | 1 | < 0.1% |
| 0.279988366 | 1 | < 0.1% |
| 0.229821086 | 1 | < 0.1% |
| 0.9748928683 | 1 | < 0.1% |
| 0.03081876028 | 1 | < 0.1% |
| Other values (1830) | 1830 | 11.5% |
| Value | Count | Frequency (%) |
| 0 | 10166 | |
| 0.0002332899006 | 1 | < 0.1% |
| 0.0009456311327 | 1 | < 0.1% |
| 0.001778542398 | 1 | < 0.1% |
| 0.001959912305 | 1 | < 0.1% |
| 0.002308561131 | 1 | < 0.1% |
| 0.00269047356 | 1 | < 0.1% |
| 0.002962010296 | 1 | < 0.1% |
| 0.00309559146 | 1 | < 0.1% |
| 0.003325954618 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 3922 | |
| 0.9996656691 | 1 | < 0.1% |
| 0.9985243699 | 1 | < 0.1% |
| 0.9980913499 | 1 | < 0.1% |
| 0.9980872799 | 1 | < 0.1% |
| 0.9979185039 | 1 | < 0.1% |
| 0.9978609047 | 1 | < 0.1% |
| 0.9967601726 | 1 | < 0.1% |
| 0.9966634226 | 1 | < 0.1% |
| 0.9963866027 | 1 | < 0.1% |
Geography[2]
Real number (ℝ≥0)
| Distinct | 1738 |
|---|---|
| Distinct (%) | 10.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2293487477 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 11399 |
| Zeros (%) | 71.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0.3135179074 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.3135179074 |
Descriptive statistics
| Standard deviation | 0.3980756202 |
|---|---|
| Coefficient of variation (CV) | 1.735678194 |
| Kurtosis | -0.2288172497 |
| Mean | 0.2293487477 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.282459674 |
| Sum | 3652.608157 |
| Variance | 0.1584641994 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 11399 | |
| 1 | 2791 | 17.5% |
| 0.6562082508 | 1 | < 0.1% |
| 0.4099738838 | 1 | < 0.1% |
| 0.3109137239 | 1 | < 0.1% |
| 0.3249207881 | 1 | < 0.1% |
| 0.2213075862 | 1 | < 0.1% |
| 0.3240288682 | 1 | < 0.1% |
| 0.6910685806 | 1 | < 0.1% |
| 0.386480749 | 1 | < 0.1% |
| Other values (1728) | 1728 | 10.9% |
| Value | Count | Frequency (%) |
| 0 | 11399 | |
| 0.0001087634214 | 1 | < 0.1% |
| 0.0007221723404 | 1 | < 0.1% |
| 0.00362340138 | 1 | < 0.1% |
| 0.004933463653 | 1 | < 0.1% |
| 0.004979153326 | 1 | < 0.1% |
| 0.00520473104 | 1 | < 0.1% |
| 0.005253660856 | 1 | < 0.1% |
| 0.005364541103 | 1 | < 0.1% |
| 0.006319723142 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 2791 | |
| 0.9997667101 | 1 | < 0.1% |
| 0.9992519456 | 1 | < 0.1% |
| 0.9969044085 | 1 | < 0.1% |
| 0.9963161349 | 1 | < 0.1% |
| 0.9962904753 | 1 | < 0.1% |
| 0.994967311 | 1 | < 0.1% |
| 0.9947310462 | 1 | < 0.1% |
| 0.9945726614 | 1 | < 0.1% |
| 0.9938933768 | 1 | < 0.1% |
CreditScore
Real number (ℝ≥0)
| Distinct | 6371 |
|---|---|
| Distinct (%) | 40.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 648.0902579 |
| Minimum | 350 |
| Maximum | 850 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 350 |
|---|---|
| 5-th percentile | 495.6645322 |
| Q1 | 584.863021 |
| median | 649 |
| Q3 | 711.4769639 |
| 95-th percentile | 801.4548831 |
| Maximum | 850 |
| Range | 500 |
| Interquartile range (IQR) | 126.6139429 |
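The range and interquartile range follow directly from the other rows: range is maximum minus minimum, and IQR is Q3 minus Q1. The sketch below checks the CreditScore figures and adds a small helper, `iqr`, that is hypothetical and uses `statistics.quantiles` with `method="inclusive"`, which should correspond to linear interpolation between data points. Whether the report uses exactly that interpolation is an assumption.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a sequence of numbers."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Values copied from the CreditScore table above.
minimum, maximum = 350, 850
q1, q3 = 584.863021, 711.4769639

value_range = maximum - minimum
reported_iqr = q3 - q1
print("Range:", value_range)
print("IQR:", round(reported_iqr, 7))

# Hypothetical toy data to exercise the helper.
toy_iqr = iqr([1, 2, 3, 4, 5])
print("Toy IQR:", toy_iqr)
```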
Descriptive statistics
| Standard deviation | 91.78988081 |
|---|---|
| Coefficient of variation (CV) | 0.1416313232 |
| Kurtosis | -0.3082183663 |
| Mean | 648.0902579 |
| Median Absolute Deviation (MAD) | 63.10019517 |
| Skewness | -0.07305499603 |
| Sum | 10321485.45 |
| Variance | 8425.382219 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 850 | 234 | 1.5% |
| 678 | 63 | 0.4% |
| 655 | 54 | 0.3% |
| 705 | 53 | 0.3% |
| 667 | 53 | 0.3% |
| 651 | 53 | 0.3% |
| 684 | 52 | 0.3% |
| 670 | 50 | 0.3% |
| 683 | 48 | 0.3% |
| 652 | 48 | 0.3% |
| Other values (6361) | 15218 |
| Value | Count | Frequency (%) |
| 350 | 5 | |
| 351 | 1 | < 0.1% |
| 358 | 1 | < 0.1% |
| 358.1963875 | 1 | < 0.1% |
| 359 | 1 | < 0.1% |
| 363 | 1 | < 0.1% |
| 365 | 1 | < 0.1% |
| 366.1762271 | 1 | < 0.1% |
| 367 | 1 | < 0.1% |
| 370.7579864 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 850 | 234 | |
| 849.994731 | 1 | < 0.1% |
| 849.4774051 | 1 | < 0.1% |
| 849.4689962 | 1 | < 0.1% |
| 849.1516499 | 1 | < 0.1% |
| 849 | 8 | 0.1% |
| 848.5877308 | 1 | < 0.1% |
| 848.1479705 | 1 | < 0.1% |
| 848 | 5 | < 0.1% |
| 847.7498587 | 1 | < 0.1% |
Gender
Real number (ℝ≥0)
| Distinct | 2929 |
|---|---|
| Distinct (%) | 18.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5057863927 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 6386 |
| Zeros (%) | 40.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.5334116911 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.4680177967 |
|---|---|
| Coefficient of variation (CV) | 0.9253269829 |
| Kurtosis | -1.890298452 |
| Mean | 0.5057863927 |
| Median Absolute Deviation (MAD) | 0.4665883089 |
| Skewness | -0.02186211022 |
| Sum | 8055.15409 |
| Variance | 0.2190406581 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 6613 | |
| 0 | 6386 | |
| 0.8589904465 | 1 | < 0.1% |
| 0.8209349412 | 1 | < 0.1% |
| 0.7326750542 | 1 | < 0.1% |
| 0.9092127531 | 1 | < 0.1% |
| 0.8320657506 | 1 | < 0.1% |
| 0.1445849973 | 1 | < 0.1% |
| 0.7200112063 | 1 | < 0.1% |
| 0.8395355017 | 1 | < 0.1% |
| Other values (2919) | 2919 |
| Value | Count | Frequency (%) |
| 0 | 6386 | |
| 0.0007221723404 | 1 | < 0.1% |
| 0.0007480543662 | 1 | < 0.1% |
| 0.001134526331 | 1 | < 0.1% |
| 0.001475630066 | 1 | < 0.1% |
| 0.001951219389 | 1 | < 0.1% |
| 0.001959912305 | 1 | < 0.1% |
| 0.002048504939 | 1 | < 0.1% |
| 0.002308561131 | 1 | < 0.1% |
| 0.00362340138 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 6613 | |
| 0.9996656691 | 1 | < 0.1% |
| 0.9988972932 | 1 | < 0.1% |
| 0.998891001 | 1 | < 0.1% |
| 0.9984132521 | 1 | < 0.1% |
| 0.9982108381 | 1 | < 0.1% |
| 0.9981313651 | 1 | < 0.1% |
| 0.9980913499 | 1 | < 0.1% |
| 0.9980559951 | 1 | < 0.1% |
| 0.9979194046 | 1 | < 0.1% |
Age
Real number (ℝ≥0)
| Distinct | 5826 |
|---|---|
| Distinct (%) | 36.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 41.16053007 |
| Minimum | 18 |
| Maximum | 92 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 34 |
| median | 40 |
| Q3 | 47.38349676 |
| 95-th percentile | 59 |
| Maximum | 92 |
| Range | 74 |
| Interquartile range (IQR) | 13.38349676 |
Descriptive statistics
| Standard deviation | 10.10249507 |
|---|---|
| Coefficient of variation (CV) | 0.2454413257 |
| Kurtosis | 0.4603700809 |
| Mean | 41.16053007 |
| Median Absolute Deviation (MAD) | 6.90210794 |
| Skewness | 0.5510800769 |
| Sum | 655522.6018 |
| Variance | 102.0604066 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 37 | 488 | 3.1% |
| 38 | 481 | 3.0% |
| 35 | 479 | 3.0% |
| 36 | 459 | 2.9% |
| 40 | 454 | 2.9% |
| 34 | 449 | 2.8% |
| 33 | 449 | 2.8% |
| 39 | 425 | 2.7% |
| 32 | 422 | 2.6% |
| 31 | 404 | 2.5% |
| Other values (5816) | 11416 |
| Value | Count | Frequency (%) |
| 18 | 22 | 0.1% |
| 19 | 27 | 0.2% |
| 20 | 40 | |
| 20.40124662 | 1 | < 0.1% |
| 21 | 53 | |
| 21.09944232 | 1 | < 0.1% |
| 22 | 84 | |
| 22.00316576 | 1 | < 0.1% |
| 22.90132076 | 1 | < 0.1% |
| 23 | 99 |
| Value | Count | Frequency (%) |
| 92 | 2 | |
| 88 | 1 | < 0.1% |
| 85 | 1 | < 0.1% |
| 84 | 2 | |
| 83 | 1 | < 0.1% |
| 82 | 1 | < 0.1% |
| 81 | 4 | |
| 80 | 3 | |
| 79.77549414 | 1 | < 0.1% |
| 79 | 4 |
Tenure
Real number (ℝ≥0)
| Distinct | 5402 |
|---|---|
| Distinct (%) | 33.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.009653933 |
| Minimum | 0 |
| Maximum | 10 |
| Zeros | 425 |
| Zeros (%) | 2.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 9 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.714424876 |
|---|---|
| Coefficient of variation (CV) | 0.5418388001 |
| Kurtosis | -1.056335205 |
| Mean | 5.009653933 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.01933717954 |
| Sum | 79783.74854 |
| Variance | 7.368102407 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 1106 | 6.9% |
| 2 | 1101 | 6.9% |
| 8 | 1078 | 6.8% |
| 5 | 1072 | 6.7% |
| 4 | 1065 | 6.7% |
| 3 | 1061 | 6.7% |
| 7 | 1057 | 6.6% |
| 9 | 1053 | 6.6% |
| 6 | 1013 | 6.4% |
| 10 | 504 | 3.2% |
| Other values (5392) | 5816 |
| Value | Count | Frequency (%) |
| 0 | 425 | |
| 0.01029127854 | 1 | < 0.1% |
| 0.01600688158 | 1 | < 0.1% |
| 0.01944004914 | 1 | < 0.1% |
| 0.02001946448 | 1 | < 0.1% |
| 0.02163496298 | 1 | < 0.1% |
| 0.03921786045 | 1 | < 0.1% |
| 0.05463956226 | 1 | < 0.1% |
| 0.05960168887 | 1 | < 0.1% |
| 0.09629537466 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 10 | 504 | |
| 9.99704874 | 1 | < 0.1% |
| 9.994146342 | 1 | < 0.1% |
| 9.986390746 | 1 | < 0.1% |
| 9.981366438 | 1 | < 0.1% |
| 9.979460798 | 1 | < 0.1% |
| 9.972003873 | 1 | < 0.1% |
| 9.970946534 | 1 | < 0.1% |
| 9.961588058 | 1 | < 0.1% |
| 9.959738488 | 1 | < 0.1% |
Balance
Real number (ℝ≥0)
| Distinct | 10828 |
|---|---|
| Distinct (%) | 68.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 81696.39741 |
| Minimum | 0 |
| Maximum | 250898.09 |
| Zeros | 5097 |
| Zeros (%) | 32.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 102991.935 |
| Q3 | 128829.7087 |
| 95-th percentile | 164851.9231 |
| Maximum | 250898.09 |
| Range | 250898.09 |
| Interquartile range (IQR) | 128829.7087 |
Descriptive statistics
| Standard deviation | 61307.63871 |
|---|---|
| Coefficient of variation (CV) | 0.7504325852 |
| Kurtosis | -1.355026414 |
| Mean | 81696.39741 |
| Median Absolute Deviation (MAD) | 37905.78283 |
| Skewness | -0.2778562518 |
| Sum | 1301096825 |
| Variance | 3758626564 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 5097 | |
| 105473.74 | 2 | < 0.1% |
| 130170.82 | 2 | < 0.1% |
| 153856.4069 | 1 | < 0.1% |
| 134796.87 | 1 | < 0.1% |
| 123509.6717 | 1 | < 0.1% |
| 113905.48 | 1 | < 0.1% |
| 110132.55 | 1 | < 0.1% |
| 161027.5753 | 1 | < 0.1% |
| 109899.251 | 1 | < 0.1% |
| Other values (10818) | 10818 |
| Value | Count | Frequency (%) |
| 0 | 5097 | |
| 639.477236 | 1 | < 0.1% |
| 3768.69 | 1 | < 0.1% |
| 9818.952477 | 1 | < 0.1% |
| 12100.21062 | 1 | < 0.1% |
| 12449.86985 | 1 | < 0.1% |
| 12459.19 | 1 | < 0.1% |
| 14262.8 | 1 | < 0.1% |
| 15614.9833 | 1 | < 0.1% |
| 16214.69884 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 250898.09 | 1 | |
| 240966.8874 | 1 | |
| 238387.56 | 1 | |
| 236158.8925 | 1 | |
| 235473.8143 | 1 | |
| 224387.7382 | 1 | |
| 222267.63 | 1 | |
| 221532.8 | 1 | |
| 221317.9704 | 1 | |
| 220803.9643 | 1 |
NumOfProducts
Real number (ℝ≥0)
| Distinct | 2825 |
|---|---|
| Distinct (%) | 17.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.509791874 |
| Minimum | 1 |
| Maximum | 4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1.032883895 |
| Q3 | 2 |
| 95-th percentile | 2.681874665 |
| Maximum | 4 |
| Range | 3 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.6103640999 |
|---|---|
| Coefficient of variation (CV) | 0.4042703571 |
| Kurtosis | 0.9593841808 |
| Mean | 1.509791874 |
| Median Absolute Deviation (MAD) | 0.03288389526 |
| Skewness | 1.052813248 |
| Sum | 24044.94538 |
| Variance | 0.3725443345 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 7902 | |
| 2 | 4787 | |
| 3 | 348 | 2.2% |
| 4 | 68 | 0.4% |
| 2.063038814 | 1 | < 0.1% |
| 1.157557266 | 1 | < 0.1% |
| 1.242377783 | 1 | < 0.1% |
| 2.855871295 | 1 | < 0.1% |
| 1.234221896 | 1 | < 0.1% |
| 1.479535443 | 1 | < 0.1% |
| Other values (2815) | 2815 | 17.7% |
| Value | Count | Frequency (%) |
| 1 | 7902 | |
| 1.000945631 | 1 | < 0.1% |
| 1.00095298 | 1 | < 0.1% |
| 1.001164404 | 1 | < 0.1% |
| 1.002139095 | 1 | < 0.1% |
| 1.002205414 | 1 | < 0.1% |
| 1.003239827 | 1 | < 0.1% |
| 1.003336577 | 1 | < 0.1% |
| 1.003623401 | 1 | < 0.1% |
| 1.003683865 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 4 | 68 | |
| 3.990645777 | 1 | < 0.1% |
| 3.979460798 | 1 | < 0.1% |
| 3.976585309 | 1 | < 0.1% |
| 3.972526693 | 1 | < 0.1% |
| 3.972049658 | 1 | < 0.1% |
| 3.968723398 | 1 | < 0.1% |
| 3.958733854 | 1 | < 0.1% |
| 3.954808958 | 1 | < 0.1% |
| 3.951544294 | 1 | < 0.1% |
HasCrCard
Real number (ℝ≥0)
| Distinct | 2552 |
|---|---|
| Distinct (%) | 16.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.7019447133 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 3471 |
| Zeros (%) | 21.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.2008734728 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 0.7991265272 |
Descriptive statistics
| Standard deviation | 0.426975975 |
|---|---|
| Coefficient of variation (CV) | 0.6082757899 |
| Kurtosis | -1.096652254 |
| Mean | 0.7019447133 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | -0.8764191325 |
| Sum | 11179.1715 |
| Variance | 0.1823084832 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 9905 | |
| 0 | 3471 | 21.8% |
| 0.9748928683 | 1 | < 0.1% |
| 0.9131683915 | 1 | < 0.1% |
| 0.06133948778 | 1 | < 0.1% |
| 0.1824114254 | 1 | < 0.1% |
| 0.03721557966 | 1 | < 0.1% |
| 0.8396786472 | 1 | < 0.1% |
| 0.3474044248 | 1 | < 0.1% |
| 0.3601133759 | 1 | < 0.1% |
| Other values (2542) | 2542 | 16.0% |
| Value | Count | Frequency (%) |
| 0 | 3471 | |
| 0.0002332899006 | 1 | < 0.1% |
| 0.0005822022221 | 1 | < 0.1% |
| 0.0007480543662 | 1 | < 0.1% |
| 0.001475630066 | 1 | < 0.1% |
| 0.001778542398 | 1 | < 0.1% |
| 0.001868634948 | 1 | < 0.1% |
| 0.001951219389 | 1 | < 0.1% |
| 0.002551575533 | 1 | < 0.1% |
| 0.003605827164 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 9905 | |
| 0.9996206421 | 1 | < 0.1% |
| 0.9980559951 | 1 | < 0.1% |
| 0.9978609047 | 1 | < 0.1% |
| 0.9967601726 | 1 | < 0.1% |
| 0.9964787649 | 1 | < 0.1% |
| 0.9962904753 | 1 | < 0.1% |
| 0.9960062631 | 1 | < 0.1% |
| 0.9956779929 | 1 | < 0.1% |
| 0.9950208467 | 1 | < 0.1% |
IsActiveMember
Real number (ℝ≥0)
| Distinct | 2812 |
|---|---|
| Distinct (%) | 17.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4595847911 |
| Minimum | 0 |
| Maximum | 1 |
| Zeros | 7195 |
| Zeros (%) | 45.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.2787612672 |
| Q3 | 1 |
| 95-th percentile | 1 |
| Maximum | 1 |
| Range | 1 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.467535181 |
|---|---|
| Coefficient of variation (CV) | 1.017299071 |
| Kurtosis | -1.86798537 |
| Mean | 0.4595847911 |
| Median Absolute Deviation (MAD) | 0.2787612672 |
| Skewness | 0.1604349596 |
| Sum | 7319.347384 |
| Variance | 0.2185891455 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 7195 | |
| 1 | 5921 | |
| 0.4711775428 | 1 | < 0.1% |
| 0.2892947946 | 1 | < 0.1% |
| 0.5501166663 | 1 | < 0.1% |
| 0.8578062853 | 1 | < 0.1% |
| 0.8354803797 | 1 | < 0.1% |
| 0.5854867048 | 1 | < 0.1% |
| 0.4146394203 | 1 | < 0.1% |
| 0.6460799869 | 1 | < 0.1% |
| Other values (2802) | 2802 | 17.6% |
| Value | Count | Frequency (%) |
| 0 | 7195 | |
| 0.0003793578812 | 1 | < 0.1% |
| 0.0007803252961 | 1 | < 0.1% |
| 0.0009456311327 | 1 | < 0.1% |
| 0.001908650055 | 1 | < 0.1% |
| 0.001912720145 | 1 | < 0.1% |
| 0.001944004914 | 1 | < 0.1% |
| 0.002048504939 | 1 | < 0.1% |
| 0.002139095346 | 1 | < 0.1% |
| 0.00269047356 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 5921 | |
| 0.9997667101 | 1 | < 0.1% |
| 0.9996656691 | 1 | < 0.1% |
| 0.9995235099 | 1 | < 0.1% |
| 0.9994177978 | 1 | < 0.1% |
| 0.9988972932 | 1 | < 0.1% |
| 0.9984132521 | 1 | < 0.1% |
| 0.9982214576 | 1 | < 0.1% |
| 0.9980487806 | 1 | < 0.1% |
| 0.9980400877 | 1 | < 0.1% |
EstimatedSalary
Real number (ℝ≥0)
| Distinct | 15925 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 100449.7306 |
| Minimum | 11.58 |
| Maximum | 199992.48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 124.5 KiB |
Quantile statistics
| Minimum | 11.58 |
|---|---|
| 5-th percentile | 9953.618414 |
| Q1 | 51218.40217 |
| median | 100598.4511 |
| Q3 | 149941.7706 |
| 95-th percentile | 190284.0885 |
| Maximum | 199992.48 |
| Range | 199980.9 |
| Interquartile range (IQR) | 98723.36846 |
Descriptive statistics
| Standard deviation | 57515.84687 |
|---|---|
| Coefficient of variation (CV) | 0.5725833858 |
| Kurtosis | -1.18931804 |
| Mean | 100449.7306 |
| Median Absolute Deviation (MAD) | 49370.70606 |
| Skewness | -0.002124005136 |
| Sum | 1599762410 |
| Variance | 3308072641 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 24924.92 | 2 | < 0.1% |
| 15217.57095 | 1 | < 0.1% |
| 160162.42 | 1 | < 0.1% |
| 172988.8657 | 1 | < 0.1% |
| 65069.03 | 1 | < 0.1% |
| 82774.07 | 1 | < 0.1% |
| 165693.06 | 1 | < 0.1% |
| 24495.03 | 1 | < 0.1% |
| 60407.93 | 1 | < 0.1% |
| 76700.93643 | 1 | < 0.1% |
| Other values (15915) | 15915 |
| Value | Count | Frequency (%) |
| 11.58 | 1 | |
| 87.78040195 | 1 | |
| 90.07 | 1 | |
| 91.75 | 1 | |
| 96.27 | 1 | |
| 106.67 | 1 | |
| 123.07 | 1 | |
| 142.81 | 1 | |
| 143.34 | 1 | |
| 178.19 | 1 |
| Value | Count | Frequency (%) |
| 199992.48 | 1 | |
| 199970.74 | 1 | |
| 199953.33 | 1 | |
| 199929.17 | 1 | |
| 199909.32 | 1 | |
| 199862.75 | 1 | |
| 199857.47 | 1 | |
| 199841.32 | 1 | |
| 199808.1 | 1 | |
| 199805.63 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
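The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The function names and the toy data are hypothetical, and tied values receive the average of their rank positions, which is a common convention assumed here.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Covariance over the product of standard deviations (1/n factors cancel)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

On this toy data the rank correlation is essentially 1 while Pearson's r is lower, which illustrates the difference the paragraph describes.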
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
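A minimal sketch of that calculation, with hypothetical toy data, also shows the invariance property: shifting and rescaling either variable by a positive factor leaves r unchanged.

```python
from statistics import mean


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n normalisation appears in numerator and denominator and cancels.
    return cov / (sx * sy)


# Hypothetical toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson(x, y)

# Change location and scale of each variable separately (positive scale factors).
x_shifted = [10 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_shifted = pearson(x_shifted, y_shifted)

print(r, r_shifted)
```

A negative scale factor on one variable would flip the sign of r while keeping its magnitude.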
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
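The pair-counting definition above can be written out directly. This sketch implements that simple variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; tie-corrected variants used by statistical libraries can give different values when the data contains ties. The function name and toy data are hypothetical.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau(x, y))
```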
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
First rows
| | Exited | Geography[0] | Geography[1] | Geography[2] | CreditScore | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 1.0 | 0.0 | 0.0 | 619.0 | 0.0 | 42.0 | 2.0 | 0.00 | 1.0 | 1.0 | 1.0 | 101348.88 |
| 1 | 0 | 0.0 | 0.0 | 1.0 | 608.0 | 0.0 | 41.0 | 1.0 | 83807.86 | 1.0 | 0.0 | 1.0 | 112542.58 |
| 2 | 1 | 1.0 | 0.0 | 0.0 | 502.0 | 0.0 | 42.0 | 8.0 | 159660.80 | 3.0 | 1.0 | 0.0 | 113931.57 |
| 3 | 0 | 1.0 | 0.0 | 0.0 | 699.0 | 0.0 | 39.0 | 1.0 | 0.00 | 2.0 | 0.0 | 0.0 | 93826.63 |
| 4 | 0 | 0.0 | 0.0 | 1.0 | 850.0 | 0.0 | 43.0 | 2.0 | 125510.82 | 1.0 | 1.0 | 1.0 | 79084.10 |
| 5 | 1 | 0.0 | 0.0 | 1.0 | 645.0 | 1.0 | 44.0 | 8.0 | 113755.78 | 2.0 | 1.0 | 0.0 | 149756.71 |
| 6 | 0 | 1.0 | 0.0 | 0.0 | 822.0 | 1.0 | 50.0 | 7.0 | 0.00 | 2.0 | 1.0 | 1.0 | 10062.80 |
| 7 | 1 | 0.0 | 1.0 | 0.0 | 376.0 | 0.0 | 29.0 | 4.0 | 115046.74 | 4.0 | 1.0 | 0.0 | 119346.88 |
| 8 | 0 | 1.0 | 0.0 | 0.0 | 501.0 | 1.0 | 44.0 | 4.0 | 142051.07 | 2.0 | 0.0 | 1.0 | 74940.50 |
| 9 | 0 | 1.0 | 0.0 | 0.0 | 684.0 | 1.0 | 27.0 | 2.0 | 134603.88 | 1.0 | 1.0 | 1.0 | 71725.73 |
Last rows
| | Exited | Geography[0] | Geography[1] | Geography[2] | CreditScore | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15916 | 1 | 0.000000 | 1.000000 | 0.0 | 567.650643 | 0.000000 | 33.960696 | 2.000000 | 115557.840616 | 1.288209 | 0.096070 | 0.903930 | 112615.951847 |
| 15917 | 1 | 0.998221 | 0.001779 | 0.0 | 659.964429 | 0.000000 | 47.991107 | 0.016007 | 90052.693680 | 1.998221 | 0.001779 | 0.998221 | 187598.569720 |
| 15918 | 1 | 0.041960 | 0.958040 | 0.0 | 779.398503 | 0.041960 | 52.622363 | 1.083919 | 81761.178289 | 1.000000 | 1.000000 | 0.000000 | 182841.899725 |
| 15919 | 1 | 0.000000 | 1.000000 | 0.0 | 746.429950 | 0.781697 | 61.232612 | 5.345092 | 112023.540743 | 1.000000 | 1.000000 | 0.218303 | 35964.861659 |
| 15920 | 1 | 1.000000 | 0.000000 | 0.0 | 571.511799 | 0.000000 | 41.500193 | 9.250290 | 0.000000 | 1.750097 | 0.249903 | 0.000000 | 170023.960197 |
| 15921 | 1 | 0.737346 | 0.262654 | 0.0 | 662.696311 | 1.000000 | 49.212039 | 5.313268 | 150334.344730 | 2.474693 | 1.000000 | 0.737346 | 124075.996213 |
| 15922 | 1 | 0.000000 | 0.000000 | 1.0 | 594.007651 | 0.000000 | 40.319051 | 2.334353 | 0.000000 | 1.000000 | 0.667177 | 1.000000 | 130144.261430 |
| 15923 | 1 | 0.227177 | 0.772823 | 0.0 | 677.732419 | 0.772823 | 44.135887 | 1.772823 | 167502.069193 | 1.000000 | 0.227177 | 0.000000 | 117569.480952 |
| 15924 | 1 | 0.947615 | 0.052385 | 0.0 | 723.342439 | 0.052385 | 61.161843 | 4.104770 | 140577.109999 | 1.000000 | 0.947615 | 0.000000 | 31178.255036 |
| 15925 | 1 | 0.000000 | 1.000000 | 0.0 | 630.403511 | 0.000000 | 52.807016 | 7.070176 | 114077.010238 | 1.000000 | 1.000000 | 0.535088 | 107400.827473 |
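In the rows shown, the three Geography columns add up to 1 in every row. That would be consistent with a one-hot style encoding of a single variable, and it may help explain the high-correlation alerts between these columns; this is a reading of the sample rows, not something the report states. A quick check on values copied from the last rows:

```python
# (Geography[0], Geography[1], Geography[2]) copied from the last rows above.
rows = [
    (0.000000, 1.000000, 0.0),
    (0.998221, 0.001779, 0.0),
    (0.041960, 0.958040, 0.0),
    (1.000000, 0.000000, 0.0),
    (0.737346, 0.262654, 0.0),
    (0.000000, 0.000000, 1.0),
    (0.227177, 0.772823, 0.0),
    (0.947615, 0.052385, 0.0),
]

# Sum of the three columns for each row.
sums = [sum(row) for row in rows]

for row, total in zip(rows, sums):
    print(row, "->", round(total, 6))
```

When columns are constrained to sum to a constant, an increase in one forces a decrease in the others, so some correlation between them is built in.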